Lab 2 - Indicators

Author

Nissim Lebovits

Published

September 20, 2023

Executive Summary

  • Define TOD
  • TOD gaining national prominence
  • Explain that we have looked at the relationship between proximity to transit and rent prices
  • In San Francisco, proximity to transit appears to be inversely related to median rent prices. This suggests that amenities other than transit access may exert a stronger influence over renters’ willingness to pay higher prices.

Introduction

  • What is TOD?
  • What data are we looking at?
  • Why do we care?

Research Overview

  • What data are we looking at? Sources, years, etc.
  • What questions will we try to ansewr?

Discussion and Analysis

  • What are we seeing in the data? Spatial or temporal trends?

Conclusion or Recommendation

Intro

Prepare a policy brief for the local City Council representatives. Do households value transit-rich neighborhoods compared to others? How certain can you be about your conclusions given some of the spatial biases we’ve discussed? You must choose a city with open transit station data. (You may do this analysis using data from the 2000 decennial census - as in the book - as the first time period in the analysis OR you may use 2009 ACS 5-year estimates. You may use ACS data of a recent year in place of 2017 if you wish).

Prepare an accessible (non-technical) R markdown document with the following deliverables. Provide a brief motivation at the beginning, annotate each visualization appropriately, and then provide brief policy-relevant conclusions. Please do not suppress code blocks - but you should fold them to make your assignment more legible. Here are the specific deliverables:

Show your data wrangling work.

  1. Four small-multiple (2000 /2009 & 2017+) visualizations comparing four selected Census variables across time and space (TOD vs. non-TOD).
  2. One grouped bar plot making these same comparisons.
  3. One table making these same comparisons.
  4. Create two graduated symbol maps of population and rent within 0.5 mile of each transit station. Google for more information, but a graduate symbol map represents quantities for each transit station proportionally.
  5. Create a geom_line plot that shows mean rent as a function of distance to subway stations (Figure 1.17). To do this you will need to use the multipleRingBuffer function found in the functions.R script.
Show the code
library(tidyverse)
library(tidycensus)
library(sf)
library(ggthemr)
library(kableExtra)
library(tmap)
library(janitor)
library(sfdep)
library(arcpullr)

source("https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/functions.r")

options(tigris_use_cache = TRUE, scipen = 999)
tmap_mode('view')
ggthemr('flat')

state <- "CA"
county <- "San Francisco"
crs <- 'EPSG:7132' # NAD 1983 us feet for san fran
acs_vars <- c("B25026_001E",
              "B02001_002E",
              "B15001_050E",
              "B15001_009E",
              "B19013_001E", 
              "B25058_001E",
              "B06012_002E")

palette5 <- c("#f0f9e8","#bae4bc","#7bccc4","#43a2ca","#0868ac")
Show the code
get_vars <- function(year) {
  suppressMessages(
      tracts10 <- get_acs(geography = "tract",
                          variables = acs_vars, 
                          year=year, 
                          state=state,
                          county=county, 
                          geometry=TRUE,
                          output = "wide") %>% 
                st_transform(crs = crs) %>%
                rename(TotalPop = B25026_001E, 
                       Whites = B02001_002E,
                       FemaleBachelors = B15001_050E, 
                       MaleBachelors = B15001_009E,
                       MedHHInc = B19013_001E, 
                       MedRent = B25058_001E,
                       TotalPoverty = B06012_002E) %>%
                mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop, 0),
                       pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop), 0),
                       pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
                       year = as.character(year)) %>%
                dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty, -ends_with("M")) %>%
                filter(!st_is_empty(geometry), ## there's a tract with an empty geom; drop it
                        GEOID != "06075980401") # %>% # drop Alcatraz
                # mutate(nb = as.character(st_contiguity(geometry))) %>%
                # filter(nb != 0) #filter out tracts not on the mainland (i.e., w no contiguous neighbors)
      |> 
  mutate(
    nb = st_knn(geometry, 5),
    wt = st_weights(nb),
    MedRent = ifelse(is.na(MedRent), purrr::map_dbl(find_xj(MedRent, nb), mean, na.rm = TRUE), MedRent))
                )
}

tracts10 <- get_vars(2010)
tracts20 <- get_vars(2020)

allTracts <- rbind(tracts10, tracts20)
allTracts <- allTracts %>%
              mutate(medRent = ifelse(year == "2010", MedRent * 1.19, MedRent)) #per BLS: https://data.bls.gov/cgi-bin/cpicalc.pl?cost1=1&year1=201001&year2=202001
Show the code
tm_shape(allTracts) +
  tm_polygons(col = "TotalPop", alpha = 0.7, border.alpha = 0, style = "quantile", palette = palette5) +
  tm_layout(frame = FALSE) +
  tm_facets(by = "year", ncol = 2)
Show the code
### 2010
rent10 <- tracts10["MedRent"]
pop10 <- tracts10["TotalPop"]

tracts10RentInterp <- st_interpolate_aw(rent10, sfmta_rail_stops_buffer, extensive = FALSE)
tracts10PopInterp <- st_interpolate_aw(pop10, sfmta_rail_stops_buffer, extensive = TRUE)

tracts10Interp <- as.data.frame(sfmta_rail_stops_buffer$stop_id)
tracts10Interp$medRent <- tracts10RentInterp$MedRent
tracts10Interp$totPop <- tracts10PopInterp$TotalPop
tracts10Interp$geometry <- sfmta_rail_stops_buffer$geoms
tracts10Interp <- tracts10Interp %>% st_as_sf(crs = crs)

t10iRent <- tm_shape(tracts10Interp) +
  tm_dots(col = "medRent", size = "medRent", border.alpha = 0, alpha = 0.5, palette = palette5, style = "quantile") +
  tm_view(view.legend.position = c("left", "bottom"))

t10iPop <- tm_shape(tracts10Interp) +
  tm_dots(col = "totPop", size = "totPop", border.alpha = 0, alpha = 0.5, palette = palette5, style = "quantile") +
  tm_view(view.legend.position = c("left", "bottom"))

### 2020
rent20 <- tracts20["MedRent"]
pop20 <- tracts20["TotalPop"]

tracts20RentInterp <- st_interpolate_aw(rent20, sfmta_rail_stops_buffer, extensive = FALSE)
tracts20PopInterp <- st_interpolate_aw(pop20, sfmta_rail_stops_buffer, extensive = TRUE)

tracts20Interp <- as.data.frame(sfmta_rail_stops_buffer$stop_id)
tracts20Interp$medRent <- tracts20RentInterp$MedRent
tracts20Interp$totPop <- tracts20PopInterp$TotalPop
tracts20Interp$geometry <- sfmta_rail_stops_buffer$geoms
tracts20Interp <- tracts20Interp %>% st_as_sf(crs = crs)

t20iRent <- tm_shape(tracts20Interp) +
  tm_dots(col = "medRent", size = "medRent", border.alpha = 0, alpha = 0.5, palette = palette5, style = "quantile") +
  tm_view(view.legend.position = c("left", "bottom"))

t20iPop <- tm_shape(tracts20Interp) +
  tm_dots(col = "totPop", size = "totPop", border.alpha = 0, alpha = 0.5, palette = palette5, style = "quantile") +
  tm_view(view.legend.position = c("left", "bottom"))


tmap_arrange(t10iPop, t20iPop)
Show the code
tmap_arrange(t10iRent, t20iRent)
Show the code
tm_shape(sfmta_rail_stops_buffer_union) +
  tm_polygons(col = palette5[3], alpha = 0.5, border.alpha = 0) +
tm_shape(sfmta_rail_routes) +
  tm_lines() +
tm_shape(sfmta_rail_stops) +
  tm_dots() + 
  tm_scale_bar(position=c("left", "bottom"))

There must be a better way to do this filtering

Show the code
suppressWarnings(
selectCentroids <- st_centroid(allTracts)[sfmta_rail_stops_buffer_union, ] %>%
                    st_drop_geometry() %>%
                    left_join(., dplyr::select(allTracts, GEOID), by = "GEOID") %>%
                    distinct(GEOID, year, .keep_all = TRUE) %>%
                    st_sf() %>%
                    mutate(TOD = "TOD")
)

suppressWarnings(
  antiSelectCentroids <- st_centroid(allTracts)[sfmta_rail_stops_buffer_union, op = st_disjoint] %>%
                        st_drop_geometry() %>%
                        left_join(., dplyr::select(allTracts, GEOID), by = "GEOID") %>% 
                        distinct(GEOID, year, .keep_all = TRUE) %>%
                        st_sf() %>%
                        mutate(TOD = "Non-TOD")
)

tractsGroup <- rbind(selectCentroids, antiSelectCentroids)
Show the code
tm_shape(tractsGroup) +
  tm_polygons(alpha = 0.4, border.col = "white", col = "TOD", palette = c(palette5[4], palette5[1])) +
tm_facets(by = "year")

(Need to add captions, etc. here.)

Show the code
tm_shape(tractsGroup) +
  tm_polygons(alpha = 0.4, border.col = "white", col = "medRent", style = "quantile", palette = palette5) +
tm_facets(by = "year") +
tm_shape(sfmta_rail_routes) +
  tm_lines() +
tm_shape(sfmta_rail_stops) +
  tm_dots() + 
  tm_scale_bar(position=c("left", "bottom")) 
Show the code
allTracts.Summary <- 
  st_drop_geometry(tractsGroup) %>%
    group_by(year, TOD) %>%
    summarize(Rent = mean(MedRent, na.rm = T),
              Population = mean(TotalPop, na.rm = T),
              Percent_White = mean(pctWhite, na.rm = T),
              Percent_Bach = mean(pctBachelors, na.rm = T),
              Percent_Poverty = mean(pctPoverty, na.rm = T))

kable(allTracts.Summary) %>%
  kable_styling() %>%
  footnote(general_title = "\n",
           general = "Table 1.2")
year TOD Rent Population Percent_White Percent_Bach Percent_Poverty
2010 Non-TOD 1369.455 3958.687 0.5424141 0.0265217 0.1064100
2010 TOD 1289.979 3984.177 0.5337059 0.0240804 0.1432555
2020 Non-TOD 2036.272 3519.310 0.4862315 0.0233500 0.1032378
2020 TOD 1946.578 3571.000 0.4644013 0.0284417 0.1215223

Table 1.2
Show the code
allTracts.Summary %>%
  rename(
    pctBach = Percent_Bach,
    pctPov = Percent_Poverty,
    pctWht = Percent_White,
    pop = Population,
    rent = Rent
  ) %>%
  gather(Variable, Value, -year, -TOD) %>%
  ggplot(aes(year, Value, fill = TOD)) +
    geom_bar(stat = "identity", position = "dodge") +
    facet_wrap(~Variable, scales = "free", ncol=3) +
    scale_fill_manual(values = c("#bae4bc", "#0868ac")) +
    labs(title = "Indicator differences across time and space") +
    plotTheme() + 
  theme(legend.position="bottom",
        aspect.ratio = 1)

Show the code
allTracts <- allTracts %>%
  rowwise() %>%
  mutate(
    distToRail = as.numeric(min(st_distance(st_centroid(geometry), sfmta_rail_stops$geoms))),
    milesToRail = as.character(round((distToRail / 5280), 1))
  ) %>%
  ungroup()

rentXDist <- allTracts %>%
        st_drop_geometry() %>%
        group_by(year, milesToRail) %>%
        summarize(
          avgRent = mean(MedRent, na.rm = TRUE)
        ) %>%
        ungroup()

ggplot(rentXDist, aes(x = as.numeric(milesToRail), y = avgRent, color = year)) +
  geom_point() +
  geom_line() + 
  geom_smooth(method = "lm", se = FALSE, linetype = "dashed", linewidth = 0.5) +
  labs(title = "Avg. Median Rent by Distance to Transit",
       subtitle = "San Francisco, 2010 to 2020",
       x = "Distance to Nearest Transit Stop (Miles)",
       y = "Average Median Rent",
       color = "Year") +
  scale_y_continuous(limits = c(0, NA))